SeqGL Identifies Context-Dependent Binding Signals in Genome-Wide Regulatory Element Maps

Authors

  • Manu Setty
  • Christina S. Leslie
Abstract

Genome-wide maps of transcription factor (TF) occupancy and regions of open chromatin implicitly contain DNA sequence signals for multiple factors. We present SeqGL, a novel de novo motif discovery algorithm to identify multiple TF sequence signals from ChIP-, DNase-, and ATAC-seq profiles. SeqGL trains a discriminative model using a k-mer feature representation together with group lasso regularization to extract a collection of sequence signals that distinguish peak sequences from flanking regions. Benchmarked on over 100 ChIP-seq experiments, SeqGL outperformed traditional motif discovery tools in discriminative accuracy. Furthermore, SeqGL can be naturally used with multitask learning to identify genomic and cell-type context determinants of TF binding. SeqGL successfully scales to the large multiplicity of sequence signals in DNase- or ATAC-seq maps. In particular, SeqGL was able to identify a number of ChIP-seq validated sequence signals that were not found by traditional motif discovery algorithms. Thus, compared to widely used motif discovery algorithms, SeqGL demonstrates both greater discriminative accuracy and higher sensitivity for detecting the DNA sequence signals underlying regulatory element maps. SeqGL is available at http://cbio.mskcc.org/public/Leslie/SeqGL/.
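The core idea in the abstract, k-mer features combined with group lasso regularization, can be illustrated with a minimal sketch. This is not the SeqGL implementation: the function names, the way k-mers are assigned to groups, the logistic loss, and the proximal gradient optimizer are all assumptions chosen for illustration. It shows only how a group penalty zeroes out whole groups of k-mer weights at once, so that the surviving groups can be read as candidate sequence signals.

```python
import itertools
import numpy as np

def kmer_counts(seq, k):
    """Count every DNA k-mer in seq using overlapping windows."""
    index = {"".join(p): i
             for i, p in enumerate(itertools.product("ACGT", repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows with N or other symbols
            counts[j] += 1
    return counts

def group_soft_threshold(w, groups, t):
    """Proximal operator of t * sum_g ||w_g||_2.

    Each group's weight vector is shrunk toward zero by t in norm;
    groups whose norm is at most t are set to zero entirely.
    """
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > t:
            out[g] = (1 - t / norm) * w[g]
    return out

def fit_group_lasso_logistic(X, y, groups, lam, step=0.1, n_iter=500):
    """Proximal gradient descent for logistic loss plus a group lasso penalty.

    X: (n_samples, n_features) k-mer count matrix
    y: labels in {0, 1} (1 = peak, 0 = flank)
    groups: list of index lists partitioning the features (hypothetical grouping)
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ w))          # predicted peak probability
        grad = X.T @ (p - y) / len(y)          # gradient of mean logistic loss
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w
```

In use, each peak and flank sequence would be turned into a row of `X` with `kmer_counts`, and the groups with nonzero weights after fitting would be inspected as candidate signals. How k-mers are clustered into groups is left open here.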

Similar resources

Discover context-specific combinatorial transcription factor interactions by integrating diverse ChIP-Seq data sets

Combinatorial interactions among transcription factors (TFs) are critical for integrating diverse intrinsic and extrinsic signals, fine-tuning regulatory output and increasing the robustness and plasticity of regulatory systems. Current knowledge about combinatorial regulation is rather limited due to the lack of suitable experimental technologies and bioinformatics tools. The rapid accumulatio...

Genome-wide mechanisms of nuclear receptor action.

Nuclear receptors are involved in a myriad of physiological processes, responding to ligands and binding to DNA at sequence-specific cis-regulatory elements. This binding occurs in the context of chromatin, a critical factor in regulating eukaryotic transcription. Recent high-throughput assays have examined nuclear receptor action genome-wide, advancing our understanding of receptor binding to ...

Allele-specific transcription factor binding in liver and cervix cells unveils many likely drivers of GWAS signals.

Genome-wide association studies (GWAS) point to regions with associated genetic variants but rarely to a specific gene and therefore detailed knowledge regarding the genes contributing to complex traits and diseases remains elusive. The functional role of GWAS-SNPs is also affected by linkage disequilibrium with many variants on the same haplotype and sometimes in the same regulatory element al...

CEAS: cis-regulatory element annotation system

The recent availability of high-density human genome tiling arrays enables biologists to conduct ChIP-chip experiments to locate the in vivo-binding sites of transcription factors in the human genome and explore the regulatory mechanisms. Once genomic regions enriched by transcription factor ChIP-chip are located, genome-scale downstream analyses are crucial but difficult for biologists without...

Calculating transcription factor binding maps for chromatin

Current high-throughput experiments already generate enough data for retrieving the DNA sequence-dependent binding affinities of transcription factors (TF) and other chromosomal proteins throughout the complete genome. However, the reverse task of calculating binding maps in a chromatin context for a given set of concentrations and TF affinities appears to be even more challenging and computati...

Journal title:

Volume 11, Issue

Pages -

Publication date: 2015